In this document we learn how to create interactive charts with Highcharter. Simply put, we learn how to transform tidy data into visually clear graphs. Within the overall workflow, this is the data visualisation step.
{{<expand "Note: LinkedIn Learning videos" "...">}} There are references to LinkedIn Learning videos. These are complementary but not really required as the notes below are meant to be self-contained. Some students and staff would have access for free. Do not purchase access unless you are sure you don’t have access through your organisation already. {{</expand>}}
library("tidyverse")
library("highcharter")
- We use the pipe (%>%) operator to create charts
For our examples we will use the same data from the Australian Environmental-Economic Accounts (2016), now including data from 2008 to 2014. The data relates to water consumption by state.
load(file = "tidy_EnvAcc_data/consumption.rdata")
consumption
## # A tibble: 48 x 3
## State year water_consumption
## <chr> <chr> <dbl>
## 1 NSW 2008–09 4555
## 2 VIC 2008–09 2951
## 3 QLD 2008–09 3341
## 4 SA 2008–09 1179
## 5 WA 2008–09 1361
## 6 TAS 2008–09 466
## 7 NT 2008–09 160
## 8 ACT 2008–09 48
## 9 NSW 2009–10 4323
## 10 VIC 2009–10 2904
## # … with 38 more rows
- We use the hchart() function
- The type argument specifies the type of visualisation we wish to create
consumption %>%
hchart(type = "bar")
- Within hchart(), we use the function hcaes() (highcharter aesthetics)
- We set x and y for the x-axis and y-axis variables respectively
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
hchart(type = "bar",
hcaes(x = year,
y = consumption_total))
Hover over the bar chart to see its interactivity!
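Note that mutate() keeps every row, so each year's total is repeated once per state. If one bar per year is the goal, summarise() is one way to get there. The sketch below assumes a small stand-in tibble, consumption_small, built from four rows copied from the printout above; the name is hypothetical and not part of the original data.

```r
library("dplyr")
library("highcharter")

# Four rows copied from the consumption printout above (stand-in data)
consumption_small <- tibble(
  State = c("NSW", "VIC", "NSW", "VIC"),
  year = c("2008–09", "2008–09", "2009–10", "2009–10"),
  water_consumption = c(4555, 2951, 4323, 2904)
)

# summarise() collapses each year to a single row;
# mutate() would keep all four rows and repeat the total
totals <- consumption_small %>%
  group_by(year) %>%
  summarise(consumption_total = sum(water_consumption))

# One bar per year
totals %>%
  hchart(type = "bar",
         hcaes(x = year,
               y = consumption_total))
```

Here totals has one row per year, so the chart draws exactly one bar for each.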
Also consider the color argument of hchart() which can be used to set custom colours.
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
hchart(type = "bar",
hcaes(x = year,
y = consumption_total),
color = "red")
- To draw a separate series for each state, we use the group argument of hcaes()
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "bar",
hcaes(x = year,
y = water_consumption,
group = State))
- To stack the bars, we use the hc_plotOptions() function
- We set its bar argument, which takes a list of sub-arguments as its value
- We use the stacking sub-argument for our purposes
We can set it to “normal” (the value documented by Highcharts) for a regular stacked bar chart…
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "bar",
hcaes(x = year,
y = water_consumption,
group = State)) %>%
hc_plotOptions(bar = list(stacking = "normal"))
…or “percent” for a percentage breakdown!
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "bar",
hcaes(x = year,
y = water_consumption,
group = State)) %>%
hc_plotOptions(bar = list(stacking = "percent"))
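With stacking set to “percent”, each bar is rescaled so that its segments sum to 100. The same shares can be computed by hand with dplyr, which is a useful check on what the chart shows. This is a minimal sketch on a stand-in tibble (consumption_small, a hypothetical name) holding four rows copied from the printout above.

```r
library("dplyr")

# Four rows copied from the consumption printout above (stand-in data)
consumption_small <- tibble(
  State = c("NSW", "VIC", "NSW", "VIC"),
  year = c("2008–09", "2008–09", "2009–10", "2009–10"),
  water_consumption = c(4555, 2951, 4323, 2904)
)

# Each state's share of its year's total, in percent
shares <- consumption_small %>%
  group_by(year) %>%
  mutate(share_percent = 100 * water_consumption / sum(water_consumption)) %>%
  ungroup()

shares
```

Within each year the share_percent values add up to 100, which matches the full-width bars of the percentage chart.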
Also note that vertical bar charts are just column charts, so to convert between the charts we simply change “bar” to “column” where relevant:
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "column",
hcaes(x = year,
y = water_consumption,
group = State)) %>%
hc_plotOptions(column = list(stacking = "percent"))
Lastly, be aware that the color argument can be vectorised for custom colours, including HTML code colours!
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "column",
hcaes(x = year,
y = water_consumption,
group = State),
color = c("gold", "blue", "pink", "orange", "green", "purple", "red", "violet")) %>%
hc_plotOptions(column = list(stacking = "percent"))
The hc_tooltip() function changes the information displayed when we mouse over our visualisation.

| Argument | Possible values | Function |
|---|---|---|
| valueDecimals | A number | Changes the number of decimal places to which our data displays |
| valueSuffix | Any string | Adds the specified string as a suffix to the data |
| shared | TRUE or FALSE | Specifies whether the hover data is for all bars or just the one we mouse over |
Consider how each of the following arguments have modified our mouse-over display:
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "bar",
hcaes(x = year,
y = water_consumption,
group = State)) %>%
hc_plotOptions(bar = list(stacking = "percent")) %>%
hc_tooltip(valueDecimals = 2,
valueSuffix = "GL",
shared = TRUE)
For our examples we will use data from the ABARES Agricultural Census of 2015-2016. The data relates to the average climate-adjusted productivity of all cropping farms between 1977 and 2015.
load("tidy_ABARES_data/farm_data.rdata")
head(farm_data, 5)
## # A tibble: 5 x 4
## year Total.factor.productivity Climate.effect Climate.adjusted.TFP
## <chr> <dbl> <dbl> <dbl>
## 1 1978 95.9 89.7 103.
## 2 1979 113. 113. 102.
## 3 1980 112. 106. 103.
## 4 1981 84.2 92.5 101.
## 5 1982 104. 105. 101.
Creating a scatter chart requires a method similar to that for the bar chart.
- In the hchart() function, we set the type to “scatter”
- We use the hcaes() arguments x and y for the data we wish to plot
farm_data %>%
hchart(type = "scatter",
hcaes(x = Climate.effect,
y = Total.factor.productivity))
We may also use the color argument of hcaes() to colour our points by some variable
farm_data %>%
hchart(type = "scatter",
hcaes(x = Climate.effect,
y = Total.factor.productivity,
color = Climate.adjusted.TFP))
For line charts, we use the type of “line”
farm_data %>%
hchart(type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP))
- To show or hide the point markers, we use the marker argument of hchart() (it sits outside hcaes())
- We set its enabled sub-argument to TRUE or FALSE
farm_data %>%
hchart(type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP),
marker = list(enabled = FALSE))
For bubble charts, we use the type of “bubble”. The argument size of hcaes() is used to determine which variable influences the size of the bubble.
farm_data %>%
hchart(type = "bubble",
hcaes(x = Climate.effect,
y = Total.factor.productivity,
size = Climate.adjusted.TFP))
We can also re-scale all the bubble sizes as we like:
- We use the hc_plotOptions() function
- We set the bubble argument of this function to a list
- We set the maxSize sub-argument of this list to be a percentage, for example “10%”
farm_data %>%
hchart(type = "bubble",
hcaes(x = Climate.effect,
y = Total.factor.productivity,
size = Climate.adjusted.TFP)) %>%
hc_plotOptions(bubble = list(maxSize = "10%"))
- To change the scale of an axis, we use the hc_xAxis() or hc_yAxis() functions
- The type argument can be set to “logarithmic”
x0 = seq(1, 10, 0.1)
y0 = log(x0)
dataXY = cbind(x0,y0)
as_tibble(dataXY) %>%
hchart(type = "scatter",
hcaes(x = x0,
y = y0))
as_tibble(dataXY) %>%
hchart(type = "scatter",
hcaes(x = x0,
y = y0)) %>%
hc_xAxis(type = "logarithmic")
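On a logarithmic x-axis each point is placed according to the logarithm of its x value, so the curve y0 = log(x0) appears as a straight line. The sketch below checks this numerically in base R, without any charting: it regresses y0 on log10(x0) and looks at the correlation. The names fit and r are introduced here for illustration only.

```r
x0 <- seq(1, 10, 0.1)
y0 <- log(x0)

# Fit a straight line to y0 against log10(x0)
fit <- lm(y0 ~ log10(x0))

# The slope equals log(10), because log(x) = log(10) * log10(x)
coef(fit)

# A correlation of 1 (up to rounding) means the points lie on a line
r <- cor(log10(x0), y0)
r
```

This is why the second chart shows the points falling on a line while the first shows a curve.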
In this section we introduce a new function, highchart():
- We use it here in place of hchart()
To create an interactive time series:
- We use the highchart() function, with the type argument of “stock”
- We pipe the result into the hc_add_series() function, which takes:
  - data, which we set to be our data
  - type, which can be “point” or “line”
  - an hcaes() argument to instruct the function which variables to plot
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity))
We have a highly interactive time series plot, including various zoom settings, and a scroll to select particular portions of the graph for viewing.
We can additionally plot multiple time series variables on the same graph. This is done simply by piping another hc_add_series() function into the mix.
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.effect))
There is no legend to distinguish the different curves, but we can add one with the hc_legend() function and by setting enabled to TRUE.
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.effect)) %>%
hc_legend(enabled = TRUE)
However the series labels are generic and the colours are distasteful! We can fix this using the name argument of hc_add_series(), which names the series (and hence the legend). We can also specify custom colours. Overall our chart becomes much nicer.
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity),
name = "TFP",
color = "orange") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP),
name = "Climate-Adjusted TFP",
color = "red") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.effect),
name = "Climate Effect",
color = "lightblue") %>%
hc_legend(enabled = TRUE)
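An alternative to repeating hc_add_series() is to reshape the data to long format and let the group argument of hcaes() create one series per variable. This is a sketch under stated assumptions: it uses hchart() rather than highchart(type = "stock"), so it gives an ordinary line chart without the stock navigator, and it runs on a stand-in tibble (farm_small, a hypothetical name) holding the first two rows of the printout above, with values rounded as printed.

```r
library("dplyr")
library("tidyr")
library("highcharter")

# First two rows of farm_data, rounded as shown in the printout (stand-in data)
farm_small <- tibble(
  year = c("1978", "1979"),
  Total.factor.productivity = c(95.9, 113),
  Climate.effect = c(89.7, 113),
  Climate.adjusted.TFP = c(103, 102)
)

# One row per year and variable
farm_long <- farm_small %>%
  pivot_longer(cols = -year,
               names_to = "series",
               values_to = "index_value")

# group = series draws one line per variable and names it in the legend
farm_long %>%
  hchart(type = "line",
         hcaes(x = year,
               y = index_value,
               group = series))
```

The trade-off is that series names come straight from the column names, so renaming columns before pivoting is the simplest way to get tidy legend labels.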
Treemaps are used to visualise the comparative sizes of a single quantitative variable among observations. For example, if we wish to see how much water each Australian state consumed in 2013-14, we might use a treemap for comparison.
- We use hchart() as usual
- We set type to “treemap”
- Within hcaes(), we set name for the observation and value for the quantitative variable
consumption %>%
filter(year == "2013–14") %>%
hchart(type = "treemap",
hcaes(name = State,
value = water_consumption))
We may set colours according to a variable:
- We use the colorValue argument of hcaes() and set this to be the variable to colour by
- We pipe the chart into the hc_colorAxis() function
- We set its minColor and maxColor arguments
consumption %>%
filter(year == "2013–14") %>%
hchart(type = "treemap",
hcaes(name = State,
value = water_consumption,
colorValue = water_consumption)) %>%
hc_colorAxis(minColor = "lightblue",
maxColor = "darkblue")
Note: here we use various words for blue, but we may also use HTML colour codes such as “#000EFF”.
There are two types of shapefiles:
- ESRI shapefiles, the older standard for shapefiles
  - To use them we must have (at least) one of each of the below:
    - A .dbf file
    - A .shp file
    - A .shx file
- GeoJson shapefiles, a newer type
  - To use them we only require one .json file
Good sources of global shapefiles are NaturalEarthData.com and Johan’s repository.
Note that for Highcharter, our process requires that we convert our shapefiles to GeoJson. We now do an example of this.
To prepare these shapefiles we require the library “sf”.
- We use read_sf() to read an entire directory of shapefiles and save the result
library("sf")
shapefile_map <- read_sf(dsn = "shapefiles")
# Note: for file path, do not include a '/' at the end
class(shapefile_map)
## [1] "sf" "tbl_df" "tbl" "data.frame"
We have our shapes - we will mutate our shape data so that they are named by state.
shapefile_map$State <- c("NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT")
We then use the geojsonio library to convert these files.
library("geojsonio")
geojson_file <- geojson_list(shapefile_map)
class(geojson_file)
## [1] "geo_list"
We are now set to make our chart.
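- We use the highchart() function
- We set its type to “map”
- We pipe the result into the hc_add_series_map() function, which takes:
  - a map argument of our geojson file
  - a df argument of our data
  - a value argument naming the column to plot
  - a joinBy argument to join the map and data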
consumption14 <- consumption %>%
filter(year == "2013–14")
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"))
Mouse over the chart! We observe that the label is a bit strange. We have ways around this:
- We use the name argument of hc_add_series_map() to change the “Series 1” label
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"),
name = "Water Consumption (GL)")
We can also change colours as we have seen before with hc_colorAxis()
consumption14 <- consumption %>%
filter(year == "2013–14")
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"),
name = "Water Consumption (GL)") %>%
hc_colorAxis(minColor = "#C5C000", maxColor = "#434000")
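Because joinBy matches the map and the data on their State values, a label that appears on one side only leaves a region without data. A quick check before plotting can catch this. The sketch below uses two plain character vectors as stand-ins (map_states and data_states are hypothetical names); in practice they would be shapefile_map$State and consumption14$State.

```r
# Stand-ins for shapefile_map$State and consumption14$State
map_states <- c("NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT")
data_states <- c("NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT")

# States in the data that have no matching shape
missing_from_map <- setdiff(data_states, map_states)

# Shapes that have no matching row in the data
missing_from_data <- setdiff(map_states, data_states)

missing_from_map
missing_from_data
```

Both results being empty means every state joins cleanly. This check only compares labels; it does not confirm that the labels were assigned to the shapes in the right order.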